Improved exponential type ratio estimator in double sampling for stratification

The objective of this research is to develop a chain-ratio-type exponential estimator of the finite population mean in double sampling for stratification. The estimator is constructed from the concept of chain-ratio estimators. It is compared with the standard unbiased estimator and with other relevant existing estimators, and the conditions under which it is more efficient are established. To support the theoretical results, the study is carried out on both natural and simulated populations.


Anurag Gupta * , Rajesh Tailor & Nitu Barod
Stratified random sampling is a commonly used approach in sampling. In recent years, significant advancements have been made in stratified sampling estimators, with a particular focus on integrating techniques such as L-moments. L-moments, an extension of conventional moments, offer a robust approach for characterizing the shape and scale of probability distributions. When applied to stratified sampling, L-moments describe the data distribution within strata, enabling more accurate and efficient estimation of population parameters. Their use in stratified sampling estimators enhances the precision of estimation, especially where the data exhibit non-normality or complex distribution patterns (see, for instance, Hosking 1 and Shahzad et al. 2).
Stratified random sampling is particularly useful when there is prior knowledge about the sampling frame and strata weights. However, in many situations, obtaining up-to-date information on strata weights can be challenging because units are added to or deleted from the population. For instance, studying the socio-economic status of people in a particular region becomes difficult and expensive due to factors like immigration, emigration, and other demographic changes that affect the strata sizes and, consequently, the strata weights.
To address this issue, double sampling for stratification is often employed as an alternative to stratified random sampling. In double sampling for stratification, a large sample is initially selected, which is then divided into homogeneous strata to estimate the strata weights. From each stratum, a sample is selected using simple random sampling without replacement, and both study and auxiliary variables are observed.
Double sampling for stratification is a widely used sampling design in forest and resource inventory, particularly in forest ecosystems. For instance, Lam et al. 3 applied double sampling for stratification in monitoring sparse tree populations in Chinese forests. This approach is cost-effective and robust.
The concept of double sampling traces its origins to Neyman 4, who first developed it to gather data on strata weights in stratified sampling. Rao 5 extended its application to address non-response issues and analytical comparisons. Ige and Tripathi 6 proposed alternative sampling strategies based on double sampling for stratification (DSS), utilizing auxiliary information from the first-phase sample in both survey design and estimation.
This work led the way for Singh and Vishwakarma 7 to introduce a general procedure for estimating population means using double sampling for stratification and auxiliary information. Tailor et al. 8 built upon the foundation laid by Ige and Tripathi 6 by exploring ratio-type and product-type exponential estimators.
For further research in this field, readers are encouraged to explore the papers by Tailor and Lone 9 , Singh and Nigam 10 , Gupta and Tailor 11 , Lone et al. 12 , and Verma et al. 13 .
Previous research has focused on classical ratio and product estimators for population mean in double sampling for stratification. Motivated by the aforementioned studies, this research introduces a novel approach by developing a chain ratio-type exponential estimator for estimating the population mean in double sampling for stratification. By exploring this new estimator, we aim to contribute to the existing literature and provide an alternative method for population mean estimation in double sampling for stratification.

Procedure for double sampling for stratification and notations
Suppose $U = (U_1, U_2, \ldots, U_N)$ is a finite population of $N$ units, divided into $L$ strata with strata weights $N_h/N$ $(h = 1, 2, 3, \ldots, L)$. The strata weights of the population $U$ are unknown, so double sampling for stratification is used.
Procedure for double sampling for stratification:
a. In the initial phase, a sample $S$ of size $n'$ is drawn using simple random sampling without replacement, and the auxiliary variables $x$ and $z$ are recorded.
b. The sample is then divided into $L$ strata based on the observed variables $x$ and $z$. Let $n'_h$ denote the number of units in stratum $h$ $(h = 1, 2, 3, \ldots, L)$, such that $n' = \sum_{h=1}^{L} n'_h$.
c. From each stratum of size $n'_h$, a sample of size $n_h = v_h n'_h$ is drawn, where $0 < v_h < 1$ $(h = 1, 2, 3, \ldots, L)$. These predetermined probabilities $v_h$ determine the sample size $n_h$ drawn from the $n'_h$ units of each stratum. The combined sample $S'$ has total size $n = \sum_{h=1}^{L} n_h$. In $S'$, the study variable $y$ and the auxiliary variables $x$ and $z$ are all observed.
Let $y$ be the study variable and $x$ and $z$ the first and second auxiliary variables, respectively. $\bar{Y}$, $\bar{X}$ and $\bar{Z}$ are the population means of $y$, $x$ and $z$; $R_1$ and $R_2$ are ratios of two population means; $\bar{Y}_h$, $\bar{X}_h$ and $\bar{Z}_h$ are the $h$th stratum means of $y$, $x$ and $z$; $S^2_{yh}$, $S^2_{xh}$ and $S^2_{zh}$ are the $h$th stratum population variances of $y$, $x$ and $z$; and $S_{yxh}$, $S_{yzh}$ and $S_{xzh}$ are the $h$th stratum covariances between $y$ and $x$, $y$ and $z$, and $x$ and $z$, respectively. The notation also covers the sampling fraction for units selected in the second-phase sample, the population variance of the study variable $y$, the population variance of the auxiliary variable $z$, and the population covariance between $y$ and $z$.
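The three steps of the procedure can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name, the `stratum_of` rule used to form strata, the rounding of $v_h n'_h$ to an integer, and the toy population are all assumptions made for the example. The estimated strata weights are taken as $w_h = n'_h/n'$.

```python
import random

def double_sample_for_stratification(population, n_prime, v, stratum_of, seed=0):
    """Two-phase (double) sampling for stratification, following steps a to c.

    population : list of (y, x, z) tuples
    n_prime    : first-phase sample size n'
    v          : dict stratum label -> second-phase fraction v_h, 0 < v_h < 1
    stratum_of : hypothetical rule mapping (x, z) to a stratum label
    """
    rng = random.Random(seed)
    # Step a: first-phase SRSWOR of size n'; only x and z are used here.
    first_phase = rng.sample(population, n_prime)

    # Step b: divide the first-phase sample into strata using x and z.
    strata = {}
    for unit in first_phase:
        _, x, z = unit
        strata.setdefault(stratum_of(x, z), []).append(unit)

    # Estimated strata weights w_h = n'_h / n'.
    weights = {h: len(units) / n_prime for h, units in strata.items()}

    # Step c: SRSWOR of size n_h = v_h * n'_h within each stratum
    # (rounded to an integer, at least 1); y, x and z are all observed.
    second_phase = {}
    for h, units in strata.items():
        n_h = max(1, round(v[h] * len(units)))
        second_phase[h] = rng.sample(units, n_h)
    return weights, second_phase

# Toy population (hypothetical values): y depends linearly on x, z on x.
_rng = random.Random(1)
population = []
for _ in range(500):
    x = _rng.gauss(50, 10)
    population.append((2 * x + _rng.gauss(0, 5), x, 0.5 * x + _rng.gauss(0, 3)))

weights, sample = double_sample_for_stratification(
    population, n_prime=200, v={"low": 0.3, "high": 0.3},
    stratum_of=lambda x, z: "low" if x < 50 else "high")
```

The strata weights are estimated from the first-phase sample only, which is the point of the design: no prior knowledge of $N_h/N$ is needed.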

Some relevant existing estimators
The unbiased estimator of $\bar{Y}$ in double sampling for stratification is denoted $\bar{y}_{ds}$. Ige and Tripathi 6 studied the Cochran 14 ratio estimator under the double sampling for stratification procedure and suggested a ratio estimator, $\hat{\bar{Y}}^{ds}_{R}$. Using the exponential function, Bahl and Tuteja 15 envisaged a ratio-type exponential estimator $\hat{\bar{Y}}_{Re}$ for $\bar{Y}$ in simple random sampling. Tailor et al. 8 studied the Bahl and Tuteja 15 estimator under the double sampling for stratification procedure, giving $\hat{\bar{Y}}^{ds}_{Re}$, which multiplies $\bar{y}_{ds}$ by an exponential term. Lakhre 16 developed a dual to ratio-type exponential estimator, $\hat{\bar{Y}}^{*}_{Re}$, for double sampling for stratification, based on
$\bar{x}^{*}_{ds} = \dfrac{N\bar{x}' - n\bar{x}_{ds}}{N - n}$.
Lone et al. 17 proposed an alternative to the Ige and Tripathi 6 estimator, $\hat{\bar{Y}}^{*}_{Rd}$, using the dual approach introduced by Srivenkataramana 18 and Bandyopadhyay 19. Motivated by Singh 20 and Lone et al. 17, Lone et al. 12 worked out a dual to ratio-cum-product type estimator, $\hat{\bar{Y}}^{*}_{Rpd}$, in double sampling for stratification.
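To make the estimators in this section concrete, the sketch below computes a few of them from summary statistics. Only the weighted means and the transformation $\bar{x}^{*}_{ds}$ are taken from the text; the functional forms of the ratio, exponential ratio and dual exponential ratio estimators are the forms commonly stated in the literature and are assumptions here, as is reading $\bar{x}'$ as the first-phase sample mean of $x$. All function names are hypothetical.

```python
import math

def weighted_mean(weights, stratum_means):
    """y_ds or x_ds: sum over strata of w_h times the stratum sample mean."""
    return sum(weights[h] * stratum_means[h] for h in weights)

def ratio_estimator(y_ds, x_ds, x_prime):
    # Assumed form of the ratio estimator in DSS: y_ds * x' / x_ds.
    return y_ds * x_prime / x_ds

def exp_ratio_estimator(y_ds, x_ds, x_prime):
    # Assumed form: y_ds * exp((x' - x_ds) / (x' + x_ds)).
    return y_ds * math.exp((x_prime - x_ds) / (x_prime + x_ds))

def dual_x(x_ds, x_prime, n, N):
    # x*_ds = (N x' - n x_ds) / (N - n), as given in the text.
    return (N * x_prime - n * x_ds) / (N - n)

def dual_exp_ratio_estimator(y_ds, x_ds, x_prime, n, N):
    # Assumed form: y_ds * exp((x*_ds - x') / (x*_ds + x')).
    x_star = dual_x(x_ds, x_prime, n, N)
    return y_ds * math.exp((x_star - x_prime) / (x_star + x_prime))

# Hypothetical summary statistics for two strata.
w = {1: 0.4, 2: 0.6}
y_ds = weighted_mean(w, {1: 10.0, 2: 20.0})
x_ds = weighted_mean(w, {1: 40.0, 2: 60.0})
```

When the second-phase mean $\bar{x}_{ds}$ equals the first-phase mean $\bar{x}'$, every estimator above reduces to $\bar{y}_{ds}$; the adjustment is driven entirely by the discrepancy between the two phases.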

Proposed estimator
Motivated by Ige and Tripathi 6 and Tailor et al. 8, we have developed an improved exponential-type ratio estimator by assuming that $\bar{Z}$ is known and replacing $\bar{x}'$ by a ratio estimator. In terms of the error terms used to derive its properties, $\bar{x}_{ds} = \bar{X}(1 + e_1)$.
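The construction just described can be written out as a sketch. The display below is one plausible reading of the description, not a formula quoted from the source: it assumes that the ratio-type exponential estimator has the form $\bar{y}_{ds}\exp\{(\bar{x}' - \bar{x}_{ds})/(\bar{x}' + \bar{x}_{ds})\}$, that $\bar{x}'$ is replaced by the ratio estimator $\bar{x}'\bar{Z}/\bar{z}'$ with $\bar{z}'$ the first-phase sample mean of $z$, and that the error terms $e_0$, $e'_1$ and $e'_2$ follow the same convention as $e_1$.

```latex
\hat{\bar{Y}}^{ds}_{CRe}
  = \bar{y}_{ds}\,
    \exp\!\left(
      \frac{\bar{x}'\,\bar{Z}/\bar{z}' - \bar{x}_{ds}}
           {\bar{x}'\,\bar{Z}/\bar{z}' + \bar{x}_{ds}}
    \right),
\qquad
\bar{y}_{ds} = \bar{Y}(1 + e_0),\quad
\bar{x}_{ds} = \bar{X}(1 + e_1),\quad
\bar{x}' = \bar{X}(1 + e'_1),\quad
\bar{z}' = \bar{Z}(1 + e'_2).
```

Under this reading, the estimator chains two adjustments: the known mean $\bar{Z}$ corrects the first-phase mean of $x$, and the corrected value then adjusts $\bar{y}_{ds}$ through the exponential term.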

Comparison with relevant estimators
From an efficiency perspective, the proposed estimator is compared with all the estimators discussed in Section "Some relevant existing estimators". The variance of the unbiased estimator and the MSEs of the Ige and Tripathi 6, Tailor et al. 8, Lakhre 16, Lone et al. 17 and Lone et al. 12 estimators in DSS are expressed using $g = \dfrac{n}{N-n}$. When (4.4), (5.1), (5.2), (5.3), (5.4), (5.5) and (5.6) are compared, it is clear that the developed chain ratio-type exponential estimator would be more efficient than each of these estimators under the corresponding condition.

Empirical study
In Section "Comparison with relevant estimators", the developed chain ratio-type exponential estimator was compared theoretically. In this section, a numerical illustration shows the practical performance of the considered estimators and of the proposed estimator. The percent relative efficiency (PRE) of the proposed estimator relative to the other considered estimators is shown in Table 2. For this purpose, two data sets have been considered. A description of the data sets is given below:
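As a small illustration of how PRE values are obtained from MSEs, the sketch below assumes the usual definition, PRE = 100 × MSE(reference) / MSE(estimator), with the unbiased estimator $\bar{y}_{ds}$ as the reference. The function name and the MSE values are hypothetical and are not taken from the paper's tables.

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE of an estimator relative to a reference estimator, in percent."""
    if mse_reference <= 0 or mse_estimator <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_reference / mse_estimator

# Hypothetical MSE values, for illustration only.
mse = {"y_ds": 4.0, "ratio": 2.5, "proposed": 2.0}
pre = {name: percent_relative_efficiency(mse["y_ds"], m)
       for name, m in mse.items()}
```

A PRE above 100 means the estimator has a smaller MSE than the reference, so the estimator with the highest PRE is the one with the least MSE.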

Simulation study
In this section, a simulation study is carried out in R to observe the performance of the developed estimator compared with the other considered estimators. Six pseudo populations of size $N$, each having two strata of sizes $N_1$ and $N_2$, equal in some populations and unequal in others, have been generated. All the populations are simulated from the normal distribution. The PRE and MSE values for the populations with equal strata sizes are given in Tables 3 and 4, respectively, and Tables 5 and 6 show the PRE and MSE values for the populations with unequal strata sizes. The results for the simulated data sets are also presented as line graphs, which show that the developed estimator has the highest PRE and the least MSE in each population.
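A simulation of this kind can be sketched as follows. The study itself was run in R; this example uses Python instead, and every numeric setting (population size, strata means, regression of $y$ on $x$, sample sizes, number of replications) is a hypothetical choice. For brevity it compares only $\bar{y}_{ds}$ with a ratio-type exponential estimator whose form, $\bar{y}_{ds}\exp\{(\bar{x}' - \bar{x}_{ds})/(\bar{x}' + \bar{x}_{ds})\}$, is assumed rather than quoted from the source.

```python
import math
import random
import statistics

def simulate(N=1000, n_prime=300, v=0.4, reps=500, seed=42):
    """Monte Carlo MSE and PRE under double sampling for stratification."""
    rng = random.Random(seed)
    # Pseudo population with two equal strata drawn from normal distributions.
    pop = []
    for h, mu in enumerate((40.0, 60.0)):
        for _ in range(N // 2):
            x = rng.gauss(mu, 5.0)
            y = 2.0 * x + rng.gauss(0.0, 4.0)
            pop.append((h, y, x))
    Y_bar = statistics.fmean(u[1] for u in pop)

    sq_err = {"y_ds": 0.0, "exp_ratio": 0.0}
    for _ in range(reps):
        # First phase: SRSWOR of size n'; x' is the first-phase mean of x.
        first = rng.sample(pop, n_prime)
        x_prime = statistics.fmean(u[2] for u in first)
        y_ds = x_ds = 0.0
        for h in (0, 1):
            units = [u for u in first if u[0] == h]
            if not units:
                continue
            w_h = len(units) / n_prime  # estimated stratum weight
            # Second phase: SRSWOR of about v * n'_h units in the stratum.
            sub = rng.sample(units, max(1, round(v * len(units))))
            y_ds += w_h * statistics.fmean(u[1] for u in sub)
            x_ds += w_h * statistics.fmean(u[2] for u in sub)
        est = {
            "y_ds": y_ds,
            "exp_ratio": y_ds * math.exp((x_prime - x_ds) / (x_prime + x_ds)),
        }
        for name, value in est.items():
            sq_err[name] += (value - Y_bar) ** 2

    mse = {name: s / reps for name, s in sq_err.items()}
    pre = {name: 100.0 * mse["y_ds"] / m for name, m in mse.items()}
    return mse, pre

mse, pre = simulate()
```

Fixing the seed makes the run reproducible; varying the strata sizes in the population loop would give the unequal-strata case.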
Populations having equal strata size:

Results and discussions
(i) Table 1 demonstrates that, of the two real data sets, population 1 meets all the conditions outlined in Section "Comparison with relevant estimators" under which the proposed estimator outperforms the other considered estimators. In contrast, population 2 fails to fulfil the conditions specified in Eqs. (5.11) and (5.12). Table 2 presents the percent relative efficiency (PRE) of all the considered estimators discussed in Section "Empirical study" as well as the proposed estimator for the two real data sets. The symbol "*" is used to indicate instances where PRE is not applicable because the conditions detailed in Eqs. (5.11) and (5.12) remain unsatisfied, a fact also highlighted in Table 1 (rows 5 and 6).
(ii) The PRE and MSE values of the proposed and other considered estimators for the first three simulated normal populations, which have equal strata sizes, are given in Tables 3 and 4, respectively. Similarly, Tables 5 and 6 show the PRE and MSE values for the next three simulated normal populations, which have unequal strata sizes.
(iii) Figures 1 and 2 show the PRE and MSE for the first three simulated normal populations with equal strata sizes and for the next three simulated normal populations with unequal strata sizes, respectively.

Conclusion
In this study, we have investigated the problem of estimating the population mean of the study variable. We have proposed a chain ratio-type exponential estimator and examined its properties, such as bias and mean squared error, up to the first degree of approximation. Our analysis in Section "Comparison with relevant estimators" has established the conditions under which the proposed estimator outperforms all the other estimators shown in Section "Some relevant existing estimators". Empirical and simulation studies have been conducted to support the theoretical findings. Under the conditions given in Section "Comparison with relevant estimators", the proposed estimator is found to be more efficient than the other considered estimators, as it has the least MSE and the highest PRE among them. The results are shown in Tables 1, 2, 3, 4, 5 and 6 and in the graphs in Figs. 1 and 2.
Overall, our research contributes to the theory of estimating the population mean in the context of double sampling for stratification. We therefore recommend the proposed estimator for estimating the population mean in real-life situations where these conditions hold.

Figure 2. Graphs for percent relative efficiency and mean squared error of $\bar{y}_{ds}$, $\hat{\bar{Y}}^{ds}_{R}$, $\hat{\bar{Y}}^{ds}_{Re}$, $\hat{\bar{Y}}^{*}_{Rd}$, $\hat{\bar{Y}}^{*}_{Re}$, $\hat{\bar{Y}}^{*}_{Rpd}$ and $\hat{\bar{Y}}^{ds}_{CRe}$.

The developed estimator $\hat{\bar{Y}}^{ds}_{CRe}$ for estimating the population mean $\bar{Y}$ is defined in terms of $\bar{x}_{ds} = \sum_{h=1}^{L} w_h \bar{x}_h$, an unbiased estimator of $\bar{X}$ in the second phase, and $\bar{y}_{ds} = \sum_{h=1}^{L} w_h \bar{y}_h$, an unbiased estimator of $\bar{Y}$ in the second phase. The expressions for the bias and MSE of $\hat{\bar{Y}}^{ds}_{CRe}$ can be found easily by considering error terms $e_i$ such that $E(e_0) = E(e_1) = E(e'_1) = \dots$

Table 1. Empirical exhibition of the theoretical conditions given in Section "Comparison with relevant estimators". "*" indicates that a condition is not satisfied.

Table 2. Percent relative efficiencies of $\bar{y}_{ds}$ and the other considered estimators.

Table 3. Percent relative efficiencies of $\bar{y}_{ds}$ and the other considered estimators.

Table 4. Mean squared error of $\bar{y}_{ds}$ and the other considered estimators (Populations 1, 2 and 3).

Table 5. Percent relative efficiencies of $\bar{y}_{ds}$ and the other considered estimators.

Table 6. Mean squared error of $\bar{y}_{ds}$ and the other considered estimators (Populations 1, 2 and 3).
(iv) All the tables and graphs show that the proposed estimator has the least MSE and the highest PRE among the considered estimators. This indicates that, under the conditions given in Section "Comparison with relevant estimators", the proposed estimator will perform better in practice than the other considered estimators, namely $\bar{y}_{ds}$, $\hat{\bar{Y}}^{ds}_{R}$, $\hat{\bar{Y}}^{ds}_{Re}$, $\hat{\bar{Y}}^{*}_{Rd}$, $\hat{\bar{Y}}^{*}_{Re}$ and $\hat{\bar{Y}}^{*}_{Rpd}$.